-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(tables): add basic table implementation #11
Conversation
// https://iceberg.apache.org/spec/#iceberg-table-spec | ||
type Metadata interface { | ||
// Version indicates the version of this metadata, 1 for V1, 2 for V2, etc. | ||
Version() int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe better to name this FormatVersion()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are already using Version()
as the method for the manifest files, so I was maintaining consistency here. If we want to rename this as FormatVersion()
we should do the same for manifest files and anywhere else we are exposing the format version via method, thoughts?
SnapshotByID(int64) *Snapshot | ||
// SnapshotByName searches the list of snapshots for a snapshot with a given | ||
// ref name. Returns nil if there's no ref with this name for a snapshot. | ||
SnapshotByName(name string) *Snapshot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, I think this should be plural
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, it's singular because it only returns 1 snapshot, not multiple. Just to clarify my understanding of iceberg a bit, is it possible / legal for there to be multiple snapshots with the same ref name? or will it always be exactly 1 or 0 snapshots?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be either 0 or 1 snapshot that is returned. In my head I was thinking about snapshotsById
that we use in the Java impl, so having singular here is fine I think
// Metadata for an iceberg table as specified in the Iceberg spec | ||
// | ||
// https://iceberg.apache.org/spec/#iceberg-table-spec | ||
type Metadata interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the main thing that's currently missing is the handling of fresh IDs for table metadata, similarly to how it's handled in Java: https://github.com/apache/iceberg/blob/2e291c2b67a643ffe93e139483df55f3639cc39d/core/src/main/java/org/apache/iceberg/TableMetadata.java#L105-L149. However, it's probably ok to handle this in a follow-up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I didn't include a NewMetadata
/ MetadataBuilder
here, right now it only parses/unmarshals JSON metadata. I planned for a follow-up to include that. If that's okay
func (t Table) Equals(other Table) bool { | ||
return slices.Equal(t.identifier, other.identifier) && | ||
t.metadataLocation == other.metadataLocation && | ||
reflect.DeepEqual(t.metadata, other.metadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so comparing metadata directly might be tricky due to Schema equality. You might want to look at the Java impl on how Schema comparison is performed (basically the schema ID is not taken into account, because a schema ID can be re-assigned)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can go ahead and merge this PR, but this is something to keep in mind for a follow-up
d242629
to
a4e6037
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @zeroshade
Basic table/metadata/snapshots implementation
CC @rdblue @nastra @Fokko @coded9 @bitsondatadev